28 research outputs found

    MetQy – an R package to query metabolic functions of genes and genomes

    Get PDF
    Summary: With the rapid accumulation of sequencing data from genomic and metagenomic studies, there is an acute need for better tools that facilitate their analyses against biological functions. To this end, we developed MetQy, an open–source R package designed for query–based analysis of functional units in [meta]genomes and/or sets of genes using the The Kyoto Encyclopedia of Genes and Genomes (KEGG). Furthermore, MetQy contains visualisation and analysis tools and facilitates KEGG’s flat file manipulation. Thus, MetQy enables better understanding of metabolic capabilities of known genomes or user–specified [meta]genomes by using the available information and can help guide studies in microbial ecology, metabolic engineering and synthetic biology. Availability and Implementation: The MetQy R package is freely available and can be downloaded from our group’s website (http://osslab.lifesci.warwick.ac.uk) or GitHub (https://github.com/OSS-Lab/MetQy)

    Tool development for metabolic analyses in the context of thermodynamic constraints

    Get PDF
    Metabolism is key to all biological processes. Studies have yet to establish the link between metabolism, extracellular electron transfer and thermodynamics. In this context, I successfully developed a platform to enable electrochemical experiments using strict anaerobic microorganisms to quantify the electron transfer in an effort to measure metabolic rates. This involved establishing a novel hypothesis to investigate “syntrophy over wires”. Moreover, I developed a computational tool, MetQy, to enable the automated, large–scale analysis of annotated genomes with metabolic information in the form of an R package. The work I presented here has paved the way for electrochemical and computational analyses towards characterising and better understanding metabolic–electronic interactions in the context of thermodynamics

    An improved machine learning pipeline for urinary volatiles disease detection:Diagnosing diabetes

    Get PDF
    Motivation The measurement of disease biomarkers in easily–obtained bodily fluids has opened the door to a new type of non–invasive medical diagnostics. New technologies are being developed and fine–tuned in order to make this possibility a reality. One such technology is Field Asymmetric Ion Mobility Spectrometry (FAIMS), which allows the measurement of volatile organic compounds (VOCs) in biological samples such as urine. These VOCs are known to contain a range of information on the relevant person’s metabolism and can in principle be used for disease diagnostic purposes. Key to the effective use of such data are well–developed data processing pipelines, which are necessary to extract the most useful data from the complex underlying biological structure. Results In this study, we present a new data analysis pipeline for FAIMS data, and demonstrate a number of improvements over previously used methods. We evaluate the effect of a series of candidate operational steps during data processing, such as the use of wavelet transforms, principal component analysis (PCA), and classifier ensembles. We also demonstrate the use of FAIMS data in our pipeline to diagnose diabetes on the basis of a simple urine sample using machine learning classifiers. We present results for data generated from a case-control study of 115 urine samples, collected from 72 type II diabetic patients, with 43 healthy volunteers as negative controls. The resulting pipeline combines the steps that resulted in the best classification model performance. These include the use of a two–dimensional discrete wavelet transform, and the Wilcoxon rank–sum test for feature selection. We are able to achieve a best ROC curve AUC of 0.825 (0.747–0.9, 95% CI) for classification of diabetes vs control. We also note that this result is robust to changes in the data pipeline and different analysis runs, with AUC > 0.80 achieved in a range of cases. This is a substantial improvement in performance over previously used data processing methods in this area. Our ability to make strong statements about FAIMS ability to diagnose diabetes is sadly limited, as we found confounding effects from the demographics when including these data in the pipeline. The demographics alone produced a best AUC of 0.87 (0.795–0.94, 95% CI). While the combination of the demographics and FAIMS data resulted in an improvement on the AUC (0.907; 0.848–0.97, 95% CI), it did not prove to be a significant difference. Nevertheless, the pipeline itself shows a significant improvement in performance over more basic methods which have been used with FAIMS data in the past

    Analysis of shared heritability in common disorders of the brain

    Get PDF
    ience, this issue p. eaap8757 Structured Abstract INTRODUCTION Brain disorders may exhibit shared symptoms and substantial epidemiological comorbidity, inciting debate about their etiologic overlap. However, detailed study of phenotypes with different ages of onset, severity, and presentation poses a considerable challenge. Recently developed heritability methods allow us to accurately measure correlation of genome-wide common variant risk between two phenotypes from pools of different individuals and assess how connected they, or at least their genetic risks, are on the genomic level. We used genome-wide association data for 265,218 patients and 784,643 control participants, as well as 17 phenotypes from a total of 1,191,588 individuals, to quantify the degree of overlap for genetic risk factors of 25 common brain disorders. RATIONALE Over the past century, the classification of brain disorders has evolved to reflect the medical and scientific communities' assessments of the presumed root causes of clinical phenomena such as behavioral change, loss of motor function, or alterations of consciousness. Directly observable phenomena (such as the presence of emboli, protein tangles, or unusual electrical activity patterns) generally define and separate neurological disorders from psychiatric disorders. Understanding the genetic underpinnings and categorical distinctions for brain disorders and related phenotypes may inform the search for their biological mechanisms. RESULTS Common variant risk for psychiatric disorders was shown to correlate significantly, especially among attention deficit hyperactivity disorder (ADHD), bipolar disorder, major depressive disorder (MDD), and schizophrenia. By contrast, neurological disorders appear more distinct from one another and from the psychiatric disorders, except for migraine, which was significantly correlated to ADHD, MDD, and Tourette syndrome. We demonstrate that, in the general population, the personality trait neuroticism is significantly correlated with almost every psychiatric disorder and migraine. We also identify significant genetic sharing between disorders and early life cognitive measures (e.g., years of education and college attainment) in the general population, demonstrating positive correlation with several psychiatric disorders (e.g., anorexia nervosa and bipolar disorder) and negative correlation with several neurological phenotypes (e.g., Alzheimer's disease and ischemic stroke), even though the latter are considered to result from specific processes that occur later in life. Extensive simulations were also performed to inform how statistical power, diagnostic misclassification, and phenotypic heterogeneity influence genetic correlations. CONCLUSION The high degree of genetic correlation among many of the psychiatric disorders adds further evidence that their current clinical boundaries do not reflect distinct underlying pathogenic processes, at least on the genetic level. This suggests a deeply interconnected nature for psychiatric disorders, in contrast to neurological disorders, and underscores the need to refine psychiatric diagnostics. Genetically informed analyses may provide important "scaffolding" to support such restructuring of psychiatric nosology, which likely requires incorporating many levels of information. By contrast, we find limited evidence for widespread common genetic risk sharing among neurological disorders or across neurological and psychiatric disorders. We show that both psychiatric and neurological disorders have robust correlations with cognitive and personality measures. Further study is needed to evaluate whether overlapping genetic contributions to psychiatric pathology may influence treatment choices. Ultimately, such developments may pave the way toward reduced heterogeneity and improved diagnosis and treatment of psychiatric disorders

    IMPACT-Global Hip Fracture Audit: Nosocomial infection, risk prediction and prognostication, minimum reporting standards and global collaborative audit. Lessons from an international multicentre study of 7,090 patients conducted in 14 nations during the COVID-19 pandemic

    Get PDF

    The Crowdsourced Replication Initiative: Investigating Immigration and Social Policy Preferences. Executive Report.

    Get PDF
    In an era of mass migration, social scientists, populist parties and social movements raise concerns over the future of immigration-destination societies. What impacts does this have on policy and social solidarity? Comparative cross-national research, relying mostly on secondary data, has findings in different directions. There is a threat of selective model reporting and lack of replicability. The heterogeneity of countries obscures attempts to clearly define data-generating models. P-hacking and HARKing lurk among standard research practices in this area.This project employs crowdsourcing to address these issues. It draws on replication, deliberation, meta-analysis and harnessing the power of many minds at once. The Crowdsourced Replication Initiative carries two main goals, (a) to better investigate the linkage between immigration and social policy preferences across countries, and (b) to develop crowdsourcing as a social science method. The Executive Report provides short reviews of the area of social policy preferences and immigration, and the methods and impetus behind crowdsourcing plus a description of the entire project. Three main areas of findings will appear in three papers, that are registered as PAPs or in process

    Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty.

    Get PDF
    This study explores how researchers' analytical choices affect the reliability of scientific findings. Most discussions of reliability problems in science focus on systematic biases. We broaden the lens to emphasize the idiosyncrasy of conscious and unconscious decisions that researchers make during data analysis. We coordinated 161 researchers in 73 research teams and observed their research decisions as they used the same data to independently test the same prominent social science hypothesis: that greater immigration reduces support for social policies among the public. In this typical case of social science research, research teams reported both widely diverging numerical findings and substantive conclusions despite identical start conditions. Researchers' expertise, prior beliefs, and expectations barely predict the wide variation in research outcomes. More than 95% of the total variance in numerical results remains unexplained even after qualitative coding of all identifiable decisions in each team's workflow. This reveals a universe of uncertainty that remains hidden when considering a single study in isolation. The idiosyncratic nature of how researchers' results and conclusions varied is a previously underappreciated explanation for why many scientific hypotheses remain contested. These results call for greater epistemic humility and clarity in reporting scientific findings

    Observing many researchers using the same data and hypothesis reveals a hidden universe of uncertainty

    No full text
    This study explores how researchers' analytical choices affect the reliability of scientific findings. Most discussions of reliability problems in science focus on systematic biases. We broaden the lens to emphasize the idiosyncrasy of conscious and unconscious decisions that researchers make during data analysis. We coordinated 161 researchers in 73 research teams and observed their research decisions as they used the same data to independently test the same prominent social science hypothesis: that greater immigration reduces support for social policies among the public. In this typical case of social science research, research teams reported both widely diverging numerical findings and substantive conclusions despite identical start conditions. Researchers' expertise, prior beliefs, and expectations barely predict the wide variation in research outcomes. More than 95% of the total variance in numerical results remains unexplained even after qualitative coding of all identifiable decisions in each team's workflow. This reveals a universe of uncertainty that remains hidden when considering a single study in isolation. The idiosyncratic nature of how researchers' results and conclusions varied is a previously underappreciated explanation for why many scientific hypotheses remain contested. These results call for greater epistemic humility and clarity in reporting scientific findings

    Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty

    No full text
    Breznau N, Rinke EM, Wuttke A, et al. Observing Many Researchers Using the Same Data and Hypothesis Reveals a Hidden Universe of Uncertainty. 2021.This study explores how researchers’ analytical choices affect the reliability of scientific findings. Most discussions of reliability problems in science focus on systematic biases. We broaden the lens to include conscious and unconscious decisions that researchers make during data analysis and that may lead to diverging results. We coordinated 161 researchers in 73 research teams and observed their research decisions as they used the same data to independently test the same prominent social science hypothesis: that greater immigration reduces support for social policies among the public. In this typical case of research based on secondary data, we find that research teams reported widely diverging numerical findings and substantive conclusions despite identical start conditions. Researchers’ expertise, prior beliefs, and expectations barely predicted the wide variation in research outcomes. More than 90% of the total variance in numerical results remained unexplained even after accounting for research decisions identified via qualitative coding of each team’s workflow. This reveals a universe of uncertainty that is hidden when considering a single study in isolation. The idiosyncratic nature of how researchers’ results and conclusions varied is a new explanation for why many scientific hypotheses remain contested. It calls for greater humility and clarity in reporting scientific findings
    corecore